Network coding construction for a special class of three unicast sessions

For a special class of three unicast sessions, in which the maximum flow from each sender to each receiver is the same positive integer k, a network coding approach is proposed. A multigeneration mixed strategy, in which (2 × n + 1) consecutive generations are taken as a mixed set, is adopted. The precoding strategy is adopted at the senders, random linear network coding technology is adopted at intermediate nodes to transmit data, and interference alignment technology is used at the receivers to eliminate some interference information. The proposed approach can construct a network coding data transmission scheme, in which the transmission rate vector of the senders is [(n + 1) × k/(2 × n + 1), n × k/(2 × n + 1), n × k/(2 × n + 1)]. The feasibility of the proposed approach is mathematically deduced and proven, and the simulation results verify the conclusions of the theoretical analysis.


Baoxing PU
Network coding [1] is a new data transmission technology that was initially studied for single-sender multicast networks and has achieved good research results in this field [2][3][4][5].
Compared to routing transmission technology, network coding allows network nodes to encode the received data before transmitting it to the output channel. Network coding has advantages in improving network throughput, reducing network energy consumption, and improving network robustness and security [6]. Since its inception, this technology has been extensively studied and has been proven to be applicable in multiple fields, such as single-source multicast, multisource multicast, wireless networks, large-capacity file distribution, cloud storage, and the Internet of Things [7][8][9][10][11]. Reference [12] points out that network coding is a key technology to meet the growing demand of future networks.
For widely used linear network coding, the output channel of a node transmits an encoding of the node's input data. This encoding is a linear combination of the input data, and the coefficients of this linear combination are called the local encoding vector. Therefore, each output channel corresponds to a local encoding vector. The local encoding vectors of all output channels of all nodes form a network coding data transmission scheme. Before using network coding for data transmission, it is necessary to design a local encoding vector for each output channel of each node, which is called constructing a network coding data transmission scheme [4,5].
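As a concrete illustration of a local encoding vector, the following minimal sketch computes the character sent on one output channel as a linear combination of a node's input characters. It assumes the field GF(2^8) with the reduction polynomial x^8 + x^4 + x^3 + x + 1, which is an arbitrary choice made for illustration (the text only requires some GF(2^m)); the names `gf_mul` and `encode_channel` are hypothetical.

```python
def gf_mul(a: int, b: int, m: int = 8, poly: int = 0x11B) -> int:
    """Multiply two elements of GF(2^m), represented as integers below 2^m.

    The default reduction polynomial 0x11B (x^8 + x^4 + x^3 + x + 1) is an
    assumption made for this sketch only.
    """
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in GF(2^m) is bitwise XOR
        b >>= 1
        a <<= 1
        if a & (1 << m):         # degree reached m: reduce modulo poly
            a ^= poly
    return result


def encode_channel(local_vector, inputs):
    """Character sent on an output channel: the sum of f_j * y_j over GF(2^m)."""
    out = 0
    for f, y in zip(local_vector, inputs):
        out ^= gf_mul(f, y)
    return out
```

For example, `encode_channel([1, 1], [0x57, 0x83])` simply adds the two input characters in the field, while a random linear network coding node would draw the entries of `local_vector` at random.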
In practical applications, most communication networks appear in the form of multiple senders. Driven by application requirements, scholars have gradually turned to the study of multisender network coding [8]. However, the construction of multisender network coding is a very difficult problem. To date, although some research results have been obtained, no substantive breakthroughs have been made. Among them, there is a special multisender network coding problem named multiunicast network coding [7], which is relatively simple and accounts for a large proportion of practical applications. The research in Ref. 13 showed that any directed acyclic network (including a multisender network) can construct a corresponding multiunicast network, and the original network and the multiunicast network have the same network coding solvability. In other words, if there is a feasible linear network coding data transmission scheme in the constructed multiunicast network, there must be a feasible linear network coding data transmission scheme in the original network. Therefore, multiunicast network coding research is, on the one hand, necessary for network coding technology applied to multiunicast networks; on the other hand, it also provides an effective way to solve the general multisender network coding problem. As a result, research on multiunicast network coding has attracted the attention of the academic community.
In Ref. 14, it was noted that it is also quite difficult to solve the multiunicast network coding problem. Even the simplest dual unicast network coding is an NP problem. In practical applications, if network coding technology is used to realize data transmission in a multiunicast network, then constructing a network coding data transmission scheme is an essential step. Therefore, the construction of multiunicast network coding is a research hotspot.
In view of the difficulty of multiunicast network coding, most of the existing research work only involves the case of few senders (dual unicast networks and three unicast networks). Through research on special and simple problems, researchers have attempted to solve general problems [15][16][17]. For existing research on three-unicast network coding, some researchers adopted heuristic algorithms, and others only focused on some special network topologies. In Ref. 15, for a special class of intersession networks formed by the cascade of butterfly networks, a construction method of a network coding scheme with better throughput was proposed by using a random linear network coding strategy and combining an evolutionary computation method. In Ref. 16, the conditions that three unicast networks with the connectivity level vector $(k_1, k_2, k_3)$ need to meet when using network coding to achieve the unit data transmission rate were analyzed. Except for three unicast networks with a connectivity level vector (2, 2, 4), all types of three unicast networks that meet the constraint condition $\min_{1 \le i \le 3}\{k_i\} \le 3$ were analyzed. For each type, either a network coding data transmission scheme implementing the unit data transmission rate or a counterexample showing that the rate cannot be achieved was given. In Ref. 17, the degree of freedom of wireless networks with channel interference was studied, and the technique of implementing precoding at the senders and using interference alignment at the receivers to eliminate interference data was proposed. Ref. 18 drew on the research idea of Ref. 17 to study a special class of three unicast session networks and proposed a construction method of network coding. In the three unicast sessions studied there, the maximum flow from each sender to each receiver was one. In that study, a precoding strategy was adopted at the senders, interference alignment technology was used at the receivers, and the graph theory method and the algebraic method were used to describe various network topology characteristics, which were summarized as the network coupling relationship; network coding construction in different situations was realized according to the network topology coupling relationship.
Inspired by the research in Ref. 18, in this study, another special class of three unicast sessions, in which the maximum flow from each sender to each receiver is the same positive integer $k$ with $k \ge 2$, is investigated. Through analysis and derivation, a feasible network coding construction algorithm is proposed, and it extends the application scope of Ref. 18. The basic idea is to adopt a multigeneration mixed strategy and select $2 \times n + 1$ consecutive generations ($n$ is a positive integer) as a mixed set. In each generation, a precoding strategy is adopted at the senders, and the intermediate nodes implement random linear network coding technology to transmit data. The receivers accept the data of each generation and store them. When the transmission of a mixed set is completed, each receiver combines the data collected in each generation to generate a linear equation system. At each receiver, with the help of the precoding matrices of the senders, interference alignment technology is adopted to eliminate part of the interference information so that the linear equation system can be solved. The proposed algorithm can construct a feasible linear network coding scheme, in which the transmission rate vector of the senders is $[(n + 1) \times k/(2 \times n + 1),\ n \times k/(2 \times n + 1),\ n \times k/(2 \times n + 1)]$. When $n$ is sufficiently large, the transmission rate vector asymptotically reaches $(k/2, k/2, k/2)$. The feasibility and correctness of the algorithm are theoretically deduced and mathematically proven. The simulation results verify the conclusion of the theoretical analysis.
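To make the stated rate vector and its limit concrete, the short sketch below evaluates it exactly for given $n$ and $k$. The function name `rate_vector` is hypothetical; the formula itself is the one stated in the abstract.

```python
from fractions import Fraction


def rate_vector(n: int, k: int):
    """Rate vector [(n+1)k/(2n+1), nk/(2n+1), nk/(2n+1)] as exact fractions."""
    p = 2 * n + 1  # number of generations in one mixed set
    return (Fraction((n + 1) * k, p), Fraction(n * k, p), Fraction(n * k, p))


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, [float(r) for r in rate_vector(n, k=2)])
```

With $k = 2$ and $n = 1$ the vector is (4/3, 2/3, 2/3); as $n$ grows, every component approaches $k/2 = 1$.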

Working mechanism of network coding for three unicast sessions
According to the description in Ref. 18, a network can be represented by a directed acyclic graph $G = (V, E)$, where $V$ is the node set and $E$ is the directed edge set. The network we are considering has three unicast sessions. The $r$th ($r = 1, 2, 3$) unicast session is represented by a tuple $w_r = (s_r, d_r, X_r)$, where $s_r$ and $d_r$ are the sender and the receiver of the $r$th unicast session, respectively. $X_r = (x_{r,1}, x_{r,2}, \ldots, x_{r,h_r})^T$ is a vector of independent random variables, each of which represents a packet that $s_r$ sends to $d_r$ in a generation (also referred to as a time slot), where $x_{r,j} \in GF(2^m)$ ($r = 1, 2, 3$ and $j = 1, 2, \ldots, h_r$). The network we consider is further restricted. Specifically, the maximum flow from each sender to each receiver is the same positive integer $k$. Figure 1 shows an example of such a network. It is not difficult to verify from the figure that there are three unicast sessions in the network and that the maximum flow from each sender to each receiver is 2.
For this network, we intend to adopt a random linear network coding strategy to transmit data. Each node (including the sender) performs a random linear network coding operation on the characters received from its input channels. For each channel $e_i \in E$, $tail(e_i)$ is the tail node of channel $e_i$, $head(e_i)$ is the head node of channel $e_i$, and the character transmitted on the channel is represented as $y_{e_i}$. Then, according to the linear network coding principle described in Ref. 3, the character transmitted by channel $e_i$ can be calculated by Formula (1).

$$y_{e_i} = \sum_{j:\, head(e_j) = tail(e_i)} f_{j,i}\, y_{e_j} \tag{1}$$

where $\{f_{j,i} \mid head(e_j) = tail(e_i)\}$ represents the local coding vector of channel $e_i$. According to the topological characteristics of the directed acyclic graph, through recursive and iterative operations, the character transmitted by each channel $e$ can be expressed as a linear combination of the characters sent by each sender, expressed by Formula (2).

$$y_e = \Gamma_{e,1} X_1 + \Gamma_{e,2} X_2 + \Gamma_{e,3} X_3 \tag{2}$$

where $(\Gamma_{e,1}, \Gamma_{e,2}, \Gamma_{e,3})$ is the global coding vector of channel $e$. For the random linear network coding strategy, each intermediate node should not only transmit the encoded characters to the output channel but also transmit the global coding vector. Due to the restriction of the maximum flow, each receiver has at least $k$ input channels. The receiver $d_i$ ($1 \le i \le 3$) collects the global coding vectors and the carried characters from its input channels and forms a linear equation system according to the corresponding relationship of Formula (2). The linear equation system can be shown by Formula (3).

$$Y_i = H_{1,i} X_1 + H_{2,i} X_2 + H_{3,i} X_3 \tag{3}$$

where $H_{j,i}$ is the transmission gain matrix from $s_j$ to $d_i$, which is only related to the transmission path from $s_j$ to $d_i$. Note that Formula (3) is a representation of the information received by $d_i$ ($1 \le i \le 3$) in a generation.
The following theorem shows that the rank of matrix $H_{j,i}$ will be equal to the maximum flow from $s_j$ to $d_i$ with a probability close to 1.
Theorem 1 Consider a multiunicast network with three unicast sessions in which (1) the maximum flow from any sender $s_j$ ($j = 1, 2, 3$) to any receiver $d_i$ ($i = 1, 2, 3$) is the same positive integer $k$; (2) the selected finite field $GF(2^m)$ has a sufficiently large order; (3) the transmission rate of each sender is set to $k$ (each sender sends $k$ characters to the network in a generation); and (4) the random linear network coding strategy is adopted to select the local coding vector of each channel (as shown in Formula (1)) to form a network coding data transmission scheme. Then, under the constructed transmission scheme, the $i$th receiver obtains a linear equation system as shown in Formula (3), and the rank of $H_{j,i}$ in the linear equation system will reach $k$ with a probability close to 1.
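As a rough intuition for why a large field order pushes the full-rank probability toward 1, the sketch below evaluates the standard counting result for a $k \times k$ matrix whose entries are drawn uniformly from $GF(2^m)$: it is invertible with probability $\prod_{i=1}^{k}(1 - 2^{-mi})$. This is only an analogy under that uniformity assumption; the transmission gain matrices in Theorem 1 are products of local coding coefficients along network paths and are not uniformly random, so the numbers are not the theorem's actual probabilities. The function name is hypothetical.

```python
from fractions import Fraction


def full_rank_probability(k: int, m: int) -> Fraction:
    """Probability that a uniformly random k x k matrix over GF(2^m) is invertible."""
    q = 2 ** m  # field order
    prob = Fraction(1)
    for i in range(1, k + 1):
        prob *= 1 - Fraction(1, q ** i)
    return prob


if __name__ == "__main__":
    for m in (1, 4, 8, 16):
        print(m, float(full_rank_probability(k=2, m=m)))
```

The value grows toward 1 as $m$ increases, which matches the qualitative role of the "sufficiently large order" condition.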
Proof After the local coding vectors of each channel are selected by a random linear network coding strategy on the given network, a network coding data transmission scheme is obtained. According to the transmission scheme, the linear equation system shown in Formula (3) is obtained for receiver $d_i$. For a sender $s_j$ ($j = 1, 2, 3$), based on the transmission scheme obtained above, a new network coding data transmission scheme is constructed. In the new scheme, the local coding vectors are set as follows: the local coding vector of each output channel of every other sender $s_{j'}$ ($j' \ne j$) is set to a zero vector, while the local coding vectors of the other channels are the same as those in the original scheme. Then, for the new transmission scheme, the linear equation system obtained by receiver $d_i$ must be as shown in Formula (4).

$$Y_i = H_{j,i} X_j \tag{4}$$
Comparing Formulas (3) and (4), the items related to the information sent by $s_j$ in Formula (3) are retained in Formula (4). However, in the new transmission scheme, the local coding vectors of the output channels of the other senders are set to zero vectors, so the items associated with the other senders do not appear. On the other hand, the new transmission scheme is equivalent to transmitting data with random linear network coding from $s_j$ to the receivers, while the other senders do not transmit data, and $H_{j,i}$ in Formula (4) is the global coding

Multigeneration mixed strategy
In the process of network coding data transmission, in a generation, after the senders send data, the receivers receive encodings of the data. There are two ways for a receiver to decode [19]. First, in a generation, each receiver combines the received information to form a linear equation system and immediately solves it to recover the sender's data. This method is referred to as the generation-by-generation strategy. Second, if all receivers are able to store data, then multiple consecutive generations are taken as the decoding unit, and the receivers store the information received in each generation. When the data transmission of the decoding unit is completed, the receivers combine the information received over the multiple generations to form a linear equation system and solve it as a whole to recover the messages of the senders. This method is referred to as the multigeneration mixed strategy, and the group of consecutive generations that forms the decoding unit is referred to as a mixed set.

Contributions of this paper
For a special class of three unicast sessions, where the maximum flow from each sender to each receiver is one, the network coding construction algorithm is given in Ref. 18. Based on those research results, we expand the research scope. Our research object is another special class of three unicast sessions, where the maximum flow from each sender to each receiver is the same positive integer $k$ with $k \ge 2$. We adopt some of the techniques used in Ref. 18, but the main difference is that we use the properties of block matrices and the Schwartz-Zippel theorem [20] to prove that the coefficient matrix of the linear equation system is invertible with a probability close to 1.
It is worth noting that the three unicast sessions we are considering cannot be efficiently solved by the method proposed in Ref. 18. If a virtual sender is added for each sender and a virtual channel with unit capacity is added to connect each virtual sender to its corresponding sender, a new set of three unicast sessions is formed, in which the maximum flow from each virtual sender to each receiver is one. The modified three unicast sessions can be solved by using the method in Ref. 18, but this wastes network transmission resources because the asymptotic data transmission rate vector of the obtained network coding scheme is only (1/2, 1/2, 1/2), whereas the asymptotic data transmission rate vector of the network coding scheme constructed by the method proposed in this paper can reach $(k/2, k/2, k/2)$.

Analysis and derivation
Because the maximum flow of each sender-receiver pair is the same positive integer $k$, according to the max-flow min-cut theorem, the transmission rate of each sender cannot exceed $k$. After selecting the finite field $GF(2^m)$, each sender can send at most $k$ characters of $GF(2^m)$ in a generation. However, due to interference in the data transmission process, it is difficult for the data transmission rate to reach $k$. We adopt the multigeneration mixed strategy. First, we select a positive integer $n$. For convenience of writing, let $p = 2 \times n + 1$, $q = n \times k$, and $l = p \times k$. Take $p$ consecutive generations as a mixed set. In a mixed set, senders $s_1$, $s_2$ and $s_3$ need to send $q + k$, $q$ and $q$ characters to the network, respectively. The characters to be transmitted from $s_1$, $s_2$ and $s_3$ in the mixed set are formed into three vectors, which are recorded as $X_1$, $X_2$ and $X_3$, respectively, and further written in the following form:

$$X_1 = (x_{1,1}, x_{1,2}, \ldots, x_{1,q+k})^T, \quad X_2 = (x_{2,1}, x_{2,2}, \ldots, x_{2,q})^T, \quad X_3 = (x_{3,1}, x_{3,2}, \ldots, x_{3,q})^T$$

At the initial time of each mixed set, each sender $s_i$ ($i = 1, 2, 3$) implements a precoding strategy, that is, according to the encoding rule of Formula (5), $s_i$ obtains $l$ characters encoded from $X_i$. The encoded $l$ characters form a vector that is recorded as $\hat{X}_i = (\hat{x}_{i,1}, \hat{x}_{i,2}, \ldots, \hat{x}_{i,l})^T$. The precoding rule is shown in Formula (5):

$$\hat{x}_{i,j} = \sum_{r=1}^{h_i} v_{i,j,r}\, x_{i,r} \tag{5}$$

where $i = 1, 2, 3$, $j = 1, 2, \ldots, l$, $r = 1, 2, \ldots, h_i$, and $v_{i,j,r} \in GF(2^m)$. Formula (5) is rewritten in the form of a matrix product to obtain Formula (6), where $V_i$ is shown in Formula (7).

$$\hat{X}_i = V_i X_i \tag{6}$$

$$V_i = \begin{pmatrix} v_{i,1,1} & v_{i,1,2} & \cdots & v_{i,1,h_i} \\ v_{i,2,1} & v_{i,2,2} & \cdots & v_{i,2,h_i} \\ \vdots & \vdots & \ddots & \vdots \\ v_{i,l,1} & v_{i,l,2} & \cdots & v_{i,l,h_i} \end{pmatrix} \tag{7}$$

$V_i$ is the precoding matrix of sender $s_i$. Note that $h_1 = q + k$ and $h_2 = h_3 = q$, so $V_1$ is a matrix with $l$ rows and $(q + k)$ columns, while $V_2$ and $V_3$ are matrices with $l$ rows and $q$ columns.
In Formula (6), $\hat{X}_i$ is a vector composed of $l$ characters. If these characters are divided evenly into $p$ groups, then each group contains $k$ characters. Furthermore, if each group is regarded as a vector and recorded as $\hat{X}^{[t]}_i$ ($t = 1, 2, \ldots, p$), then Formula (6) can be expressed in the form of Formula (8).

$$\begin{pmatrix} \hat{X}^{[1]}_i \\ \hat{X}^{[2]}_i \\ \vdots \\ \hat{X}^{[p]}_i \end{pmatrix} = V_i X_i \tag{8}$$
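The precoding and grouping steps can be sketched as follows. To stay short, this sketch works over GF(2) (that is, $m = 1$), which is an assumption for illustration only since the scheme calls for a large field; the matrix `V1` is an arbitrary example rather than a precoding matrix produced by the construction, and the function names are hypothetical.

```python
def precode(V, X):
    """Compute V @ X over GF(2): V is an l x h list of rows, X has h entries."""
    return [sum(v & x for v, x in zip(row, X)) % 2 for row in V]


def split_into_generations(encoded, k):
    """Split the l encoded characters into p = l // k groups of k characters."""
    return [encoded[t:t + k] for t in range(0, len(encoded), k)]


if __name__ == "__main__":
    n, k = 1, 2
    p, q = 2 * n + 1, n * k          # p = 3 generations, q = 2
    l = p * k                        # l = 6 encoded characters per sender
    V1 = [[1, 0, 0, 0],              # arbitrary l x (q + k) example matrix
          [0, 1, 0, 0],
          [0, 0, 1, 0],
          [0, 0, 0, 1],
          [1, 1, 0, 0],
          [0, 0, 1, 1]]
    X1 = [1, 0, 1, 1]                # the q + k = 4 message characters of s_1
    groups = split_into_generations(precode(V1, X1), k)
    print(groups)                    # one group of k characters per generation
```

Each inner list is the group of $k$ encoded characters that the sender transmits in one generation of the mixed set.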
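A mixed set has $p$ generations in which each sender needs to transmit $l$ encoded characters. Therefore, in each generation, each sender should transmit $k$ encoded characters. In the $t$th ($1 \le t \le p$) generation of the mixed set, $s_i$ needs to transmit the $k$ characters represented by $\hat{X}^{[t]}_i$; intermediate nodes of the network adopt a random linear network coding strategy for data transmission. The receiver $d_i$ ($i = 1, 2, 3$) receives the global coding vectors and the carried characters from $k$ input channels and correlates both according to the rules shown in Formula (3). Then, a linear equation system with $k$ equations is obtained. If it is written in matrix form, Formula (9) is obtained:

$$Y^{[t]}_i = M^{[t]}_{1,i} \hat{X}^{[t]}_1 + M^{[t]}_{2,i} \hat{X}^{[t]}_2 + M^{[t]}_{3,i} \hat{X}^{[t]}_3 \tag{9}$$

where $Y^{[t]}_i = (\hat{y}_{i,(t-1)k+1}, \hat{y}_{i,(t-1)k+2}, \ldots, \hat{y}_{i,tk})^T$ is a vector composed of $k$ characters and $M^{[t]}_{j,i}$ replaces $H_{j,i}$ in Formula (3). $M^{[t]}_{j,i}$ is the transmission gain matrix from $s_j$ to $d_i$ in the $t$th generation. Formula (9) represents the information received by $d_i$ ($i = 1, 2, 3$) in the $t$th generation and contains $k$ equations. When the data transmission of a mixed set is completed, receiver $d_i$ can combine the equations obtained in the mixed set to form a linear equation system with $p \times k$ equations, as shown in Formula (10).

$$\begin{cases} Y^{[1]}_i = M^{[1]}_{1,i} \hat{X}^{[1]}_1 + M^{[1]}_{2,i} \hat{X}^{[1]}_2 + M^{[1]}_{3,i} \hat{X}^{[1]}_3 \\ Y^{[2]}_i = M^{[2]}_{1,i} \hat{X}^{[2]}_1 + M^{[2]}_{2,i} \hat{X}^{[2]}_2 + M^{[2]}_{3,i} \hat{X}^{[2]}_3 \\ \quad\vdots \\ Y^{[p]}_i = M^{[p]}_{1,i} \hat{X}^{[p]}_1 + M^{[p]}_{2,i} \hat{X}^{[p]}_2 + M^{[p]}_{3,i} \hat{X}^{[p]}_3 \end{cases} \tag{10}$$

Formula (10) is written in the form of a matrix to obtain:

$$Y_i = M_{1,i} \hat{X}_1 + M_{2,i} \hat{X}_2 + M_{3,i} \hat{X}_3 \tag{11}$$

where

$$Y_i = \begin{pmatrix} Y^{[1]}_i \\ Y^{[2]}_i \\ \vdots \\ Y^{[p]}_i \end{pmatrix} \tag{12}$$

$$M_{j,i} = \begin{pmatrix} M^{[1]}_{j,i} & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & M^{[2]}_{j,i} & \cdots & \mathbf{0} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \cdots & M^{[p]}_{j,i} \end{pmatrix} \tag{13}$$

where $i = 1, 2, 3$ and $j = 1, 2, 3$. Replacing $\hat{X}_i$ in Formula (11) with the right side of Formula (6), we obtain

$$Y_i = M_{1,i} V_1 X_1 + M_{2,i} V_2 X_2 + M_{3,i} V_3 X_3 \tag{14}$$

In Formula (13), $M_{j,i}$ is a block diagonal square matrix of order $l$, and its main diagonal is composed of $p$ square matrices of order $k$, which are respectively expressed as $M^{[t]}_{j,i}$ ($t = 1, 2, \ldots, p$). Each bold $\mathbf{0}$ represents the zero matrix of order $k$.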
Theorem 2 If the maximum flow from each sender to each receiver is the same positive integer $k$, a random linear network coding strategy is adopted to transmit data in each generation of a mixed set, and the order of the finite field $GF(2^m)$ is sufficiently large, then the matrix $M_{j,i}$ in Formula (13) will be invertible with a probability close to 1.
Proof We note that the maximum flow from $s_j$ to $d_i$ is $k$, and $s_j$ transmits $k$ encoded characters to the network in each generation. Each intermediate node adopts a random linear network coding strategy to determine the local coding vectors of its output channels. In the $t$th ($t = 1, 2, \ldots, p$) generation, the transmission gain matrix from $s_j$ to $d_i$ is $M^{[t]}_{j,i}$, which is obtained by receiver $d_i$. According to Theorem 1, when the order of the selected finite field $GF(2^m)$ is sufficiently large, the rank of $M^{[t]}_{j,i}$ will reach $k$ with a probability close to 1. $M^{[t]}_{j,i}$ is a square matrix of order $k$; therefore, the determinant $|M^{[t]}_{j,i}|$ will not be equal to zero with a probability close to 1. On the other hand, $M_{j,i}$ is a block diagonal square matrix. According to the properties of the block diagonal matrix [16], the determinant of $M_{j,i}$ can be expressed as follows:

$$|M_{j,i}| = \prod_{t=1}^{p} \left| M^{[t]}_{j,i} \right| \tag{15}$$

Formula (15) means that the determinant of the block diagonal square matrix is equal to the product of the determinants of the square blocks on the main diagonal, so $|M_{j,i}|$ will not be zero with a probability close to 1; that is, $M_{j,i}$ will be invertible with a probability close to 1.
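The block diagonal property used in this proof can be checked numerically. The sketch below builds a block diagonal matrix from $p = 3$ blocks of order $k = 2$ and compares its determinant with the product of the block determinants. For brevity it uses exact rational arithmetic on small integer matrices rather than $GF(2^m)$, which is an assumption of this sketch (the identity itself holds over any field); the helper names `det` and `block_diag` are hypothetical.

```python
from fractions import Fraction


def det(matrix):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)          # no pivot: the matrix is singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result            # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def block_diag(blocks):
    """Place square blocks on the main diagonal of an otherwise zero matrix."""
    size = sum(len(b) for b in blocks)
    out = [[0] * size for _ in range(size)]
    offset = 0
    for b in blocks:
        for r, row in enumerate(b):
            for c, value in enumerate(row):
                out[offset + r][offset + c] = value
        offset += len(b)
    return out


if __name__ == "__main__":
    blocks = [[[1, 2], [3, 4]], [[0, 1], [1, 0]], [[2, 0], [0, 3]]]
    product = Fraction(1)
    for b in blocks:
        product *= det(b)
    print(det(block_diag(blocks)) == product)
```

If any single block is singular, the product, and hence the determinant of the whole matrix, is zero, which is why the proof needs every per-generation matrix to be full rank.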
According to the definition of three unicast sessions, receiver $d_i$ only needs to obtain the information vector $X_i$ sent by $s_i$, and the linear equation system obtained by $d_i$ is shown in Formula (14). From Formula (14), it can be seen that the linear equation system has $l = (2 \times n + 1) \times k$ equations but contains messages from three senders, which are represented as unknown elements in the linear equation system. The number of unknown elements reaches $(3n + 1) \times k$, which is greater than the number of equations. Therefore, some of the unknown elements should be eliminated from the linear equation system shown in Formula (14). In this paper, the interference alignment strategy proposed in Ref. 17 is adopted to eliminate some of the unknown elements.
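The count of unknowns and equations above can be written out explicitly. The following worked arithmetic uses only the quantities already defined ($q = n \times k$, $l = (2 \times n + 1) \times k$, and message lengths $q + k$, $q$, $q$); the numerical instance $n = 1$, $k = 2$ is chosen here purely for illustration.

```latex
\begin{aligned}
\text{unknowns} &= \underbrace{(q+k)}_{X_1} + \underbrace{q}_{X_2} + \underbrace{q}_{X_3}
                 = 3nk + k = (3n+1)k, \\
\text{equations} &= l = (2n+1)k, \\
\text{surplus} &= (3n+1)k - (2n+1)k = nk = q. \\[4pt]
n = 1,\ k = 2:\quad & 8 \text{ unknowns},\ 6 \text{ equations},\ \text{surplus } q = 2.
\end{aligned}
```

So exactly $q$ unknowns have to be removed at each receiver, which is the number of interference symbols that an alignment step can merge into existing unknowns.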
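After the data transmission of a mixed set is completed, the linear equation systems obtained by the receivers are as follows:

$$Y_1 = M_{1,1} V_1 X_1 + M_{2,1} V_2 X_2 + M_{3,1} V_3 X_3 \tag{16}$$

$$Y_2 = M_{1,2} V_1 X_1 + M_{2,2} V_2 X_2 + M_{3,2} V_3 X_3 \tag{17}$$

$$Y_3 = M_{1,3} V_1 X_1 + M_{2,3} V_2 X_2 + M_{3,3} V_3 X_3 \tag{18}$$

Formula (16) represents the linear equation system obtained by $d_1$. $d_1$ needs to obtain $X_1$ by solving the linear equation system. To adopt the interference alignment strategy, we force

$$M_{2,1} V_2 = M_{3,1} V_3 \tag{19}$$

If Formula (19) holds, Formula (16) can be written as Formula (20).

$$Y_1 = M_{1,1} V_1 X_1 + M_{3,1} V_3 (X_2 + X_3) \tag{20}$$

Writing Formula (20) in matrix form gives

$$Y_1 = \begin{pmatrix} M_{1,1} V_1 & M_{3,1} V_3 \end{pmatrix} \begin{pmatrix} X_1 \\ X_2 + X_3 \end{pmatrix} \tag{21}$$

where $\begin{pmatrix} M_{1,1} V_1 & M_{3,1} V_3 \end{pmatrix}$ is the coefficient matrix of the linear equation system in Formula (21). For Formula (21), the number of unknown elements is reduced to $l$, which makes the number of unknown elements equal to the number of equations. Furthermore, Formula (21) contains the message variable $X_1$ required by $d_1$.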
where $\begin{pmatrix} M_{1,2} V_1 & M_{2,2} V_2 \end{pmatrix}$ is the coefficient matrix of the linear equation system. For Formula (24), the number of unknown elements is reduced to $l$, which makes the number of unknown elements equal to the number of equations. Furthermore, Formula (24) contains the message variable $X_2$ required by $d_2$.
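Formula (18) represents the linear equation system obtained by $d_3$. Using interference alignment technology, we force the following:

$$M_{2,3} V_2 = M_{1,3} V_1 B \tag{25}$$

where $B$ is a matrix of $(q + k)$ rows and $q$ columns. $B$ is obtained by removing the leftmost $k$ columns from the identity matrix of order $(q + k)$. Combining Formula (25) and Formula (18), we obtain:

$$Y_3 = M_{1,3} V_1 (X_1 + B X_2) + M_{3,3} V_3 X_3 \tag{26}$$

The identity matrix of order $(q + k)$ is represented as follows.

$$I_{q+k} = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}$$

According to the definition of matrix $B$, $B$ is represented as follows.

$$B = \begin{pmatrix} \mathbf{0}_{k \times q} \\ I_q \end{pmatrix}$$

Then, $B X_2$ can be represented as follows.

$$B X_2 = (\underbrace{0, \ldots, 0}_{k}, x_{2,1}, x_{2,2}, \ldots, x_{2,q})^T$$

Formula (26) can then be written in matrix form:

$$Y_3 = \begin{pmatrix} M_{1,3} V_1 & M_{3,3} V_3 \end{pmatrix} \begin{pmatrix} X_1 + B X_2 \\ X_3 \end{pmatrix} \tag{27}$$

where $\begin{pmatrix} M_{1,3} V_1 & M_{3,3} V_3 \end{pmatrix}$ is the coefficient matrix. For Formula (27), the number of unknown elements is reduced to $l$. Furthermore, Formula (27) contains the message variable $X_3$ required by $d_3$.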
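The matrix $B$ described above (the identity matrix of order $q + k$ with its leftmost $k$ columns removed) and its effect on $X_2$ can be sketched as follows. The function names are hypothetical, and plain integers stand in for field elements; because $B$ contains only 0 and 1, the product behaves the same way over $GF(2^m)$.

```python
def selection_matrix_B(q: int, k: int):
    """Identity matrix of order q + k with its leftmost k columns removed."""
    size = q + k
    return [[1 if row == col else 0 for col in range(k, size)]
            for row in range(size)]


def apply_B(B, x2):
    """Compute B @ x2; every entry of B is 0 or 1."""
    return [sum(b * x for b, x in zip(row, x2)) for row in B]


if __name__ == "__main__":
    n, k = 1, 2
    q = n * k
    B = selection_matrix_B(q, k)       # (q + k) rows, q columns
    print(apply_B(B, [7, 9]))          # k zeros followed by the entries of x2
```

Multiplying by $B$ therefore pads $X_2$ with $k$ leading zeros so that it has the same length as $X_1$, which is what allows the two vectors to be added into a single block of $q + k$ unknowns.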
The key is now how to determine the elements of $V_1$, $V_2$ and $V_3$ so that Formulas (20), (23) and (26) can be established simultaneously.
According to Theorem 2, $M_{j,i}$ is invertible with a large probability, and the inverse matrix of $M_{j,i}$ is written as $M^{-1}_{j,i}$. From Formula (19), we have Formula (28). From Formula (23), we have Formula (29). We combine Formula (29) and Formula (28) to obtain Formula (30). From Formula (25), we have Formula (31). We combine Formula (30) and Formula (31) to obtain Formula (32). We left-multiply both sides of Formula (32) by the matrix $M^{-1}_{1,3} M_{2,3}$ and obtain Formula (33).

Let the matrix $T$ be defined as in Formula (34).
Then, Formula (33) becomes the following form:

The three block matrices on the main diagonal of $M_{3,3}$ are:

The matrix $T$ is calculated according to Formula (34), and the three block matrices on the main diagonal of $T$ are:

The elements in $W$ are randomly selected. $W$ is a matrix of 6 rows and 2 columns. The elements of $W$ are as follows:

$V^*_1$, $V^*_2$, and $V^*_3$ are calculated according to Formulas (35), (38) and (39), respectively, to obtain

The coefficient matrix of the linear equation system obtained by $d_1$ is:

From the coefficient matrix shown in Formula (41), it can be seen that the fifth column is the same as the seventh column, and the sixth column is the same as the eighth column. Note that the unknown elements corresponding to Columns 5 and 6 are $x_{2,1}$ and $x_{2,2}$, respectively, and the unknown elements corresponding to Columns 7 and 8 are $x_{3,1}$ and $x_{3,2}$, respectively. Let $x'_{1,1} = x_{2,1} + x_{3,1}$ and $x'_{1,2} = x_{2,2} + x_{3,2}$; then, the linear equation system has six equations and six unknown elements. After taking the first, second, third, fourth, seventh and eighth columns of the matrix in Formula (41), the coefficient matrix of the linear equation system after elimination is obtained as follows:

It is not difficult to verify that the coefficient matrix in Formula (42) is a full-rank matrix, so the linear equation system can be solved. The solution results are $x_{1,1}, x_{1,2}, x_{1,3}, x_{1,4}, x'_{1,1}, x'_{1,2}$, of which the first four messages are all from sender $s_1$, which is exactly what receiver $d_1$ needs, and the last two are interference messages. The coefficient matrix of the linear equation system obtained at receiver d